Data Set:

The Pokemon dataset contains information on 801 Pokemon.

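It includes each Pokemon's name, percentage of males, height, weight, base stats, egg steps, happiness, capture rate, experience growth, generation, and legendary status (see the summary below).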

The data were downloaded from: https://www.kaggle.com/rounakbanik/pokemon/data

Packages needed:

##         names     percentage_male     height_m        weight_kg     
##  Abomasnow :  1   Min.   :  0.00   Min.   : 0.100   Min.   :  0.10  
##  Abra      :  1   1st Qu.: 50.00   1st Qu.: 0.600   1st Qu.:  9.00  
##  Absol     :  1   Median : 50.00   Median : 1.000   Median : 27.30  
##  Accelgor  :  1   Mean   : 55.16   Mean   : 1.164   Mean   : 61.38  
##  Aegislash :  1   3rd Qu.: 50.00   3rd Qu.: 1.500   3rd Qu.: 64.80  
##  Aerodactyl:  1   Max.   :100.00   Max.   :14.500   Max.   :999.90  
##  (Other)   :795   NA's   :98       NA's   :20       NA's   :20      
##        hp             attack          defense           speed       
##  Min.   :  1.00   Min.   :  5.00   Min.   :  5.00   Min.   :  5.00  
##  1st Qu.: 50.00   1st Qu.: 55.00   1st Qu.: 50.00   1st Qu.: 45.00  
##  Median : 65.00   Median : 75.00   Median : 70.00   Median : 65.00  
##  Mean   : 68.96   Mean   : 77.86   Mean   : 73.01   Mean   : 66.33  
##  3rd Qu.: 80.00   3rd Qu.:100.00   3rd Qu.: 90.00   3rd Qu.: 85.00  
##  Max.   :255.00   Max.   :185.00   Max.   :230.00   Max.   :180.00  
##                                                                     
##  base_egg_steps  base_happiness    capture_rate    experience_growth
##  Min.   : 1280   Min.   :  0.00   Min.   :  3.00   Min.   : 600000  
##  1st Qu.: 5120   1st Qu.: 70.00   1st Qu.: 45.00   1st Qu.:1000000  
##  Median : 5120   Median : 70.00   Median : 60.00   Median :1000000  
##  Mean   : 7191   Mean   : 65.36   Mean   : 98.76   Mean   :1054996  
##  3rd Qu.: 6400   3rd Qu.: 70.00   3rd Qu.:170.00   3rd Qu.:1059860  
##  Max.   :30720   Max.   :140.00   Max.   :255.00   Max.   :1640000  
##                                   NA's   :1                         
##    sp_attack        sp_defense       generation    is_legendary    
##  Min.   : 10.00   Min.   : 20.00   Min.   :1.00   Min.   :0.00000  
##  1st Qu.: 45.00   1st Qu.: 50.00   1st Qu.:2.00   1st Qu.:0.00000  
##  Median : 65.00   Median : 66.00   Median :4.00   Median :0.00000  
##  Mean   : 71.31   Mean   : 70.91   Mean   :3.69   Mean   :0.08739  
##  3rd Qu.: 91.00   3rd Qu.: 90.00   3rd Qu.:5.00   3rd Qu.:0.00000  
##  Max.   :194.00   Max.   :230.00   Max.   :7.00   Max.   :1.00000  
## 

PCA on Characteristics of Pokemon:

The variables from height_m to sp_defense are the quantitative variables used in the PCA.

The supplementary qualitative variables are generation and is_legendary.

##        eigenvalue percentage of variance cumulative percentage of variance
## comp 1  4.6406574              38.672145                          38.67214
## comp 2  1.3770799              11.475666                          50.14781
## comp 3  1.2355183              10.295986                          60.44380
## comp 4  0.8768248               7.306873                          67.75067

In our case, we study the first 3 dimensions: each has an eigenvalue greater than 1, and together they account for 60.44% of the total variance.
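The percentages in the eigenvalue table can be checked by hand. The following is a minimal base R sketch, assuming the PCA was run on the 12 standardized variables from height_m to sp_defense (so the total variance equals 12) and using the four eigenvalues printed above. The names `eig`, `p`, `eig_table` and `n_keep` are illustrative only.

```r
# Eigenvalues of the first four components, copied from the table above.
eig <- c(4.6406574, 1.3770799, 1.2355183, 0.8768248)

# Assumption: 12 standardized variables, so the total variance is 12.
p <- 12

# Percentage of variance carried by each component.
pct <- eig / p * 100

eig_table <- data.frame(
  eigenvalue       = eig,
  pct_variance     = pct,
  cum_pct_variance = cumsum(pct),
  row.names        = paste("comp", seq_along(eig))
)
print(eig_table)

# Kaiser criterion: keep the components whose eigenvalue exceeds 1.
n_keep <- sum(eig > 1)
n_keep
```

Under that assumption, `n_keep` is 3 and the third cumulative percentage rounds to 60.44, which matches the table.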

Optimal number of clusters:

NbClust is used to determine the optimal number of clusters for k-means and for hierarchical clustering. The silhouette index for each candidate number of clusters (2 to 20) is shown below:

##      2      3      4      5      6      7      8      9     10     11 
## 0.2238 0.2342 0.2334 0.2011 0.1504 0.1480 0.1147 0.1238 0.1206 0.1339 
##     12     13     14     15     16     17     18     19     20 
## 0.1273 0.1380 0.1335 0.1258 0.1206 0.1282 0.1303 0.1269 0.1289

In the following, 3 is taken as the optimal number of clusters, since it has the highest silhouette index (0.2342).
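The selection can be reproduced from the printed values alone. Below is a small base R sketch with the silhouette indices typed in from the output above. The names `sil` and `best_k` are illustrative, and the original NbClust call is not shown in this document, so it is not reproduced here.

```r
# Silhouette index for 2 to 20 clusters, copied from the output above.
sil <- c(0.2238, 0.2342, 0.2334, 0.2011, 0.1504, 0.1480, 0.1147, 0.1238,
         0.1206, 0.1339, 0.1273, 0.1380, 0.1335, 0.1258, 0.1206, 0.1282,
         0.1303, 0.1269, 0.1289)
names(sil) <- 2:20

# The candidate with the largest silhouette index is retained.
best_k <- as.integer(names(sil)[which.max(sil)])
best_k

# The three best candidates, to see how close the runners-up are.
sort(sil, decreasing = TRUE)[1:3]
```

The gap between 3 and 4 clusters is small (0.2342 against 0.2334), so the choice of 3 is a judgment call rather than a sharp optimum.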

Clustering Using kmeans:

##     height_m   weight_kg         hp     attack    defense      speed
## 1  1.6998535  1.88177010  1.1646459  1.1354725  1.0251661  0.8160740
## 2  0.1316936  0.01270379  0.3344813  0.3523130  0.3339804  0.2742898
## 3 -0.5410006 -0.41989423 -0.6991946 -0.7169343 -0.6686642 -0.5436197
##   base_egg_steps base_happiness capture_rate experience_growth  sp_attack
## 1      2.6817483     -1.9642062   -0.9950961        1.20658511  1.2453287
## 2     -0.1924714      0.1936187   -0.5191537       -0.05644976  0.3248104
## 3     -0.3151650      0.1600267    0.9112943       -0.18235096 -0.7034575
##   sp_defense
## 1  1.1521201
## 2  0.3887621
## 3 -0.7695236
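The cluster centers above are in standard deviation units around zero, which suggests the variables were standardized before clustering. The sketch below shows that workflow in base R, assuming standardization with `scale()` and 3 centers. The data frame `toy` is a simulated stand-in, not the Pokemon data, and the seed and `nstart` value are arbitrary choices.

```r
set.seed(42)

# Simulated stand-in for a few quantitative columns (NOT the real dataset).
toy <- data.frame(
  height_m  = c(rnorm(50, 0.5, 0.1), rnorm(50, 1.2, 0.2), rnorm(50, 3.0, 0.5)),
  weight_kg = c(rnorm(50, 10, 2),    rnorm(50, 60, 10),   rnorm(50, 300, 40)),
  hp        = c(rnorm(50, 45, 5),    rnorm(50, 75, 8),    rnorm(50, 110, 10))
)

# Drop incomplete rows (kmeans() does not accept NA), then standardize
# each column to mean 0 and standard deviation 1.
X <- scale(na.omit(toy))

# k-means with 3 clusters; several random starts reduce the risk of a
# poor local optimum.
km <- kmeans(X, centers = 3, nstart = 25)

km$centers   # one row per cluster, in z-score units
km$size      # number of observations per cluster
```

With standardized inputs, a positive entry in `km$centers` means the cluster sits above the overall average on that variable. Read this way, cluster 1 in the table above is high on nearly every stat but low on base_happiness and capture_rate.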